RAW SEQUENCE LISTING

PATENT APPLICATION US/08/836,075 



DATE: 10/08/97 
TIME: 13:05:11 



INPUT SET: S20848.raw



General Information 



This Raw Listing contains the General 
Information Section and those Sequences 
containing ERRORS.



SEQUENCE LISTING 



(i) APPLICANT: MAERTENS, GEERT
STUYVER, LIEVEN 



(ii) TITLE OF INVENTION: NEW SEQUENCES OF HEPATITIS C VIRUS GENOTYPES 
AND THEIR USE AS PROPHYLACTIC, THERAPEUTIC AND DIAGNOSTIC 
AGENTS 



(iii) NUMBER OF SEQUENCES: 207 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: ARNOLD, WHITE & DURKEE 

(B) STREET: P.O. BOX 4433 

(C) CITY: HOUSTON 

(D) STATE: TEXAS 

(E) COUNTRY: USA 

(F) ZIP: 77210-4433 

(v) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: Microsoft Word 6.0 / ASCII text output 

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 08/836,075 

(B) FILING DATE: 21 Apr 1997 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: PCT/EP95/04155

(B) FILING DATE: 23 Oct 1995 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: EP 94 870166.9 

(B) FILING DATE: 21 Oct 1994 

(viii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: EP 95870076.7

(B) FILING DATE: 28 Jun 1995 

(ix) ATTORNEY/AGENT INFORMATION:

(A) NAME: KAMMERER, PATRICIA A. 
46 (B) REGISTRATION NUMBER: 29,775 

47 (C) REFERENCE/DOCKET NUMBER: INNS:004
48 



ERRORED SEQUENCES FOLLOW: 



893 (2) INFORMATION FOR SEQ ID NO: 22: 

894

895 (i) SEQUENCE CHARACTERISTICS:

--> 896 (A) LENGTH: 48 amino acids

--> 897 (B) TYPE: amino acid

--> 898 (D) TOPOLOGY: linear

899 

900 (ii) MOLECULE TYPE: peptide 

901 

902 

903 

--> 904 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:

905 

906 Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr Asn

907 1 5 10 15
908

909 Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Arg Ser Leu Ala

910 20 25 30 
911 

912 Glu Tyr Thr Cys Ala Arg Arg Gly Lys Leu Arg Arg Ser Ser Met Gly 

913 35 40 45 
914 

915 

950 (2) INFORMATION FOR SEQ ID NO: 24: 
951 

952 (i) SEQUENCE CHARACTERISTICS: 

--> 953 (A) LENGTH: 149 amino acids

954 (B) TYPE: amino acid

955 (D) TOPOLOGY: linear
956 

957 (ii) MOLECULE TYPE: peptide 

958 

959 

960 

961 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 

962 

963 Asp Gly Ile Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser

964 1 5 10 15

965

966 Ile Phe Leu Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Ala Ser Ala

967 20 25 30
968

969 Val Gln Val Lys Asn Thr Ser His Ser Tyr Met Val Thr Asn Asp Cys
970 35 40 45 

971 

972 Ser Asn Ser Ser Ile Val Trp Gln Leu Lys Asp Ala Val Leu His Val

973 50 55 60
974

975 Pro Gly Cys Val Pro Cys Glu Arg His Gln Asn Gln Ser Arg Cys Trp

976 65 70 75 80
977

978 Ile Pro Val Thr Pro Asn Val Ala Val Ser Gln Pro Gly Ala Leu Thr

979 85 90 95

980

981 Arg Gly Leu Arg Thr His Ile Asp Thr Ile Val Ala er Ala Thr Val

982 100 105 110

983

984 Cys Ser Ala Leu Tyr Val Gly Asp Phe Cys Gly Ala Val Met Leu Val

985 115 120 125
986

987 Ser Gln Phe Phe Met Ile Ser Pro Gln His His Ile Phe Val Gln Asp

988 130 135 140
989

990 Cys Asn Cys Ser Ile

991 145 
992 

1161 (2) INFORMATION FOR SEQ ID NO: 30: 
1162 

1163 (i) SEQUENCE CHARACTERISTICS:

--> 1164 (A) LENGTH: 149 amino acids

1165 (B) TYPE: amino acid

1166 (D) TOPOLOGY: linear
1167 

1168 (ii) MOLECULE TYPE: peptide 

1169 

1170 

1171 

1172 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 

1173 

1174 Asp Gly Ile Asn Phe Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser

1175 1 5 10 15
1176

1177 Ile Phe Leu Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Ala Ser Ala

1178 20 25 30
1179

1180 Ile Asn Tyr Arg Asn Val Ser Gly Ile Tyr Tyr Val Thr Asn Asp Cys

1181 35 40 45
1182

1183 Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp His His Ile Leu His Leu

1184 50 55 60

1185

1186 Pro Gly Cys Val Pro Cys Val Arg Glu Gly Asn Gln Ser Arg Cys Trp

1187 65 70 75 80

1188

1189 Val Ala Leu Thr Pro Thr Val Ala Ala Pro Tyr Ile Gly Ala Pro Leu
1190 85 90 95
1191

1192 Glu Ser Leu Arg Ser His Val Asp Leu Met Val Gly Ala Ala Thr Val

1193 100 105 110
1194

--> 1195 Cys Ser Ala Leu Tyr Ile Gly Asp Xaa Cys Xa Gly Leu Phe Leu Val

1196 115 120 125
1197

1198 Gly Gln Met Phe Ser Phe Arg Pro Arg Arg His Trp Thr Thr Gln Asp

1199 130 135 140
1200

1201 Cys Asn Cys Ser Ile

1202 145
1203

1204 (2) INFORMATION FOR SEQ ID NO: 31:
1205

1206 (i) SEQUENCE CHARACTERISTICS:

--> 1207 (A) LENGTH: 447 base pairs

1208 (B) TYPE: nucleic acid

1209 (C) STRANDEDNESS: single

1210 (D) TOPOLOGY: linear
1211

1212 (ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iii) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:

GACGGGATCA ATTATGCAAC AGGGAACCTT CCCGGTTGCT CTTTTTCTAT CTTCCTCTTG 60

GCACTCCTCT CGTGCCTGAC TGTTCCCGCT TCGGCCATTA ACTACCGCAA CACCTCGGGC 120

ATCTACCACG TCACCAATGA CTGCCCGAAC TCGAGCATAG TTTATGAGGC CGACCACCAC 180

ATCTTGCACC TTCCAGGTTG CGTGCCCTGC GTGAGAACTG GGAATCAGTC ACGTTGCTGG 240

GTGGCCCTTA CTCCTACCGT CGCAGCGCCA TACATCGGCG CACCGCTTGA GTCTCTGCGG 300

AGTCATGTGG ATCTGATGGT GGGGGCTGCC ACTGTTTGCT CAGCCCTTTA CATCGGGGAT 360

--> 1234 TTGTGTGCCG GCTTGTTCTT GGTTGGTCAG ATGTTTTCTT TCCGACCACCG ACGCCACTGG 420

ACTGCCCAGG ATTGCAATTG TTCTATC
SEQUENCE VERIFICATION REPORT

PATENT APPLICATION US/08/836,075 



DATE: 10/08/97 
TIME: 13:05:21 



INPUT SET: S20848.raw 



Line  Error                                                     Original Text

866   Entered (310) and Calc. Seq. Length ([illegible]) differ  (A) LENGTH: 310 base pairs
896   Unknown or Misplaced Identifier                           (A) LENGTH: 48 amino acids
897   Unknown or Misplaced Identifier                           (B) TYPE: amino acid
898   Unknown or Misplaced Identifier                           (D) TOPOLOGY: linear
904   Wrong or Missing Strandedness Value                       (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:
904   Wrong or Missing Sequence Topology                        (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:
953   Entered (149) and Calc. Seq. Length (148) differ          (A) LENGTH: 149 amino acids
981   Wrong Amino Acid Designator                               Arg Gly Leu Arg Thr His Ile Asp Thr Ile Val Ala er Ala Th
981   Wrong Amino Acid Designator                               Arg Gly Leu Arg Thr His Ile Asp Thr Ile Val Ala er Ala Th
1164  Entered (149) and Calc. Seq. Length (148) differ          (A) LENGTH: 149 amino acids
1195  Wrong Amino Acid Designator                               Cys Ser Ala Leu Tyr Ile Gly Asp Xaa Cys Xa Gly Leu Phe
1207  Entered (447) and Calc. Seq. Length (448) differ          (A) LENGTH: 447 base pairs
1234  # of Sequences for line conflicts w/ running total        TTGTGTGGCG GCTTGTTCTT GGTTGGTCAG ATGTTT
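
The designator and length errors reported for line 1195 can be illustrated with a short sketch. This is not the Checker program and makes no claim about how it works internally; the function name `check_peptide`, the set of accepted codes, and the declared length of 14 (simply the token count of the excerpt quoted in the report) are assumptions made for illustration only.

```python
# Hypothetical sketch: count valid three-letter amino acid codes and
# compare the result with a declared length. The accepted-code set
# (20 standard residues plus Xaa) is an assumption, not taken from Checker.
VALID_CODES = {
    "Ala", "Arg", "Asn", "Asp", "Cys", "Gln", "Glu", "Gly", "His", "Ile",
    "Leu", "Lys", "Met", "Phe", "Pro", "Ser", "Thr", "Trp", "Tyr", "Val",
    "Xaa",
}


def check_peptide(residue_lines, declared_length):
    """Return (calculated_length, bad_tokens, lengths_differ)."""
    calculated = 0
    bad_tokens = []
    for line in residue_lines:
        for token in line.split():
            if token in VALID_CODES:
                calculated += 1  # only recognised designators are counted
            else:
                bad_tokens.append(token)  # e.g. a truncated code such as "Xa"
    return calculated, bad_tokens, calculated != declared_length


# Excerpt of line 1195 as quoted in the verification report.
excerpt = ["Cys Ser Ala Leu Tyr Ile Gly Asp Xaa Cys Xa Gly Leu Phe"]
calculated, bad_tokens, lengths_differ = check_peptide(excerpt, declared_length=14)
print(calculated, bad_tokens, lengths_differ)
```

Under these assumptions, a single malformed designator both appears in the list of bad tokens and leaves the calculated length one short of the declared length, which is the same pairing of errors the report shows for lines 1164 and 1195.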



BIOTECHNOLOGY SYSTEMS BRANCH




Notice of Availability of 
Checker Program 

Applicant Aid for Biotechnology Computer Readable Form (CRF) 
Sequence Listing Submissions 

The Patent and Trademark Office (PTO) has developed a computer program, called Checker, that 
will aid applicants in identifying and correcting errors prior to making submissions for compliance 
with the Requirements for Patent Applications Containing Nucleotide Sequence and/or Amino Acid 
Sequence Disclosures (Sequence Rules: 37 CFR 1.821 through 1.825).

Final rules were published in the Federal Register (55 FR 18230) on May 1, 1990, and in the PTO
Official Gazette (1114 Off. Gaz. Pat. Office 29) on May 15, 1990.

Checker is a DOS-based software program that is intended to assist users in determining whether 
errors may be present in the sequence listings, and is not intended to guarantee that the submission 
is error-free. 

The most current version of the software is available via computer downloading; details are below.
Copies on diskette are also available. Updated software versions will not be automatically mailed 
out; any updates will be announced in the PTO Official Gazette. 

The software can be accessed/requested from the following locations: 

1) Dial-up access through the Internet. Location is ftp://ftp.uspto.gov 
The software is in current directory: pub/checker/ 
Download all the files. Cost: Free-of-charge 

3) For diskette copies, mail to: U.S.P.T.O., OEIP, CRYSTAL PARK 3, SUITE 441

WASHINGTON DC 20231 

COST FOR DISKETTE IS $25.00

METHOD OF PAYMENT: 

Check payable to Commissioner of Patents and Trademarks 
VISA/MasterCard charge: charges can be faxed to 703-306-2737
PTO Deposit Account 



For Further Information, Contact: Afti Shah at 703-308-4212 



